Feature Location in a Collection of Product Variants: Combining Information Retrieval and Hierarchical Clustering

Authors

  • Hamzeh Eyal Salman
  • Abdelhak-Djamel Seriai
  • Christophe Dony
Abstract

Locating the source code elements relevant to a given feature is an important step in re-engineering software variants, developed through ad-hoc reuse, into a Software Product Line (SPL) for systematic reuse. Existing approaches based on Information Retrieval (IR) techniques do not consider the abstraction gap between the feature level and the source code level. In our recent work, we improved the effectiveness of IR-based feature location by introducing an intermediate level between the feature and source code levels, called "code-topics". We used Formal Concept Analysis (FCA) to identify these code-topics. In this paper, we investigate the results of using the Agglomerative Hierarchical Clustering (AHC) algorithm to identify code-topics. In our experimental evaluation, we show that AHC significantly increases the recall of feature location, with a minor decrease in precision, compared to FCA.
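To make the idea of grouping source code elements into code-topics with AHC more concrete, here is a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the linkage criterion, or the stopping rule, so cosine similarity over term counts, average linkage, and a fixed similarity threshold are all assumptions chosen for illustration. The class names and terms in `docs` are hypothetical toy data.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def ahc(docs, threshold):
    """Agglomerative clustering of named term lists.

    docs: mapping from element name to its list of terms.
    threshold: merging stops once the best pair is less similar than this
               (an assumed stopping rule, not taken from the paper).
    Returns a sorted list of clusters, each a sorted list of names.
    """
    vectors = {name: Counter(terms) for name, terms in docs.items()}
    clusters = [[name] for name in sorted(docs)]  # start with singletons

    def linkage(c1, c2):
        # Average linkage: mean pairwise similarity between the two clusters.
        sims = [cosine(vectors[x], vectors[y]) for x in c1 for y in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best, bi, bj = -1.0, None, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = linkage(clusters[i], clusters[j])
                if s > best:
                    best, bi, bj = s, i, j
        if best < threshold:
            break  # no remaining pair is similar enough to merge
        merged = sorted(clusters[bi] + clusters[bj])
        clusters = [c for k, c in enumerate(clusters) if k not in (bi, bj)]
        clusters.append(merged)
    return sorted(clusters)


# Hypothetical toy input: class names mapped to terms from their identifiers.
docs = {
    "DrawLine": ["draw", "line", "shape", "canvas"],
    "DrawRect": ["draw", "rect", "shape", "canvas"],
    "SaveFile": ["save", "file", "disk"],
    "LoadFile": ["load", "file", "disk"],
}
code_topics = ahc(docs, threshold=0.5)
```

In this sketch each resulting cluster plays the role of one candidate code-topic. Raising the threshold yields more, smaller clusters; lowering it yields fewer, larger ones.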

Similar resources

A Clustering Algorithm for Categorical Data Based on Combining Measures

Clustering is one of the main techniques in data mining. It is a process that partitions a data set into groups such that the data within a cluster are as similar to each other as possible and the data in different clusters are as different as possible. Clustering algorithms are divided into two categories according to the type of data: Clustering algorithms for numerical data and clustering algor...

A Pattern Thesaurus for Browsing Large Aerial Photographs

A texture based image retrieval system for browsing large-scale aerial photographs is presented. The salient components of this system include texture feature extraction, image segmentation and grouping, learning similarity measure, and a texture thesaurus model for fast search and indexing. The texture features are computed by filtering the image with a bank of Gabor filters. This is followed ...

A Method Based on Divisive Hierarchical Clustering for Indexing Image Information

Multi-dimensional indexing structures are conventionally used to accelerate search operations in content-based image retrieval systems. Many efforts have been made so far to develop such structures. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...

Assessment of the Performance of Clustering Algorithms in the Extraction of Similar Trajectories

In recent years, the rapid growth of spatial trajectory data, together with the need to process it and extract useful information and meaningful patterns, has attracted many researchers to the field of spatio-temporal trajectory clustering. The process and analysis of these trajectories have resulted in the extraction of useful information whic...

Developing a Course Recommender by Combining Clustering and Fuzzy Association Rules

Each semester, students go through the process of selecting appropriate courses. It is difficult to find information about each course and ultimately make decisions. The objective of this paper is to design a course recommender model which takes student characteristics into account to recommend appropriate courses. The model uses clustering to identify students with similar interests and skills...

Publication date: 2014